Exploiting Audio-visual Correlation in Coding of Talking Head Sequences
Authors
Ram R. Rao (Georgia Institute of Technology, Atlanta, GA 30332); Tsuhan Chen (AT&T Bell Laboratories, Holmdel, NJ 07733)
Abstract
In this paper, we present a novel means for predicting the shape of a person's mouth from the corresponding speech signal and explore applications of this prediction to video coding. One possible application is cross-modal predictive coding. In the cross-modal predictive coding system described in this paper, a model-based video coder compares measured visual parameters with predicted visual parameters and sends the difference between the two to the receiver. Since the decoder also receives the acoustic data, it can form the same prediction and then reconstruct the original parameters by adding the transmitted error signal.
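The encoder/decoder loop described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the linear predictor `predict_mouth_params`, its `weights`, the uniform quantizer with step size `step`, and all sample values are hypothetical and do not come from the paper.

```python
def predict_mouth_params(acoustic_features, weights):
    # Hypothetical predictor: each visual parameter is a weighted sum of
    # the acoustic features. The paper's actual predictor is not shown here.
    return [sum(w * a for w, a in zip(row, acoustic_features)) for row in weights]


def encode(measured, acoustic_features, weights, step=0.5):
    # Encoder: form the audio-based prediction, then send only the
    # quantized difference between measured and predicted parameters.
    # The uniform quantizer is an assumption made for this sketch.
    predicted = predict_mouth_params(acoustic_features, weights)
    return [round((m - p) / step) for m, p in zip(measured, predicted)]


def decode(indices, acoustic_features, weights, step=0.5):
    # Decoder: it receives the same acoustic data, so it can form the
    # same prediction and add the transmitted error back onto it.
    predicted = predict_mouth_params(acoustic_features, weights)
    return [p + q * step for p, q in zip(predicted, indices)]


# Illustrative values only: three acoustic features, two visual parameters.
audio = [0.2, -1.0, 0.5]
weights = [[1.0, 0.5, -0.2], [0.3, -0.4, 0.8]]
measured = [0.1, 1.2]

sent = encode(measured, audio, weights)
reconstructed = decode(sent, audio, weights)
```

Because both sides compute the prediction from the shared audio, only the residual needs to travel over the video channel. In this sketch the reconstruction error per parameter is bounded by half the quantizer step.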
Related resources
A Framework for Data-driven Video-realistic Audio-visual Speech-synthesis
In this work, we present a framework for generating a video-realistic audio-visual “Talking Head”, which can be integrated into applications as a natural human-computer interface where audio alone is not an appropriate output channel, especially in noisy environments. Our work is based on 2D video-frame concatenative visual synthesis and a unit-selection-based text-to-speech system. In order to ...
Data-Driven Tools for Designing Talking Heads Exploiting Emotional Attitudes
Audio/visual speech, in the form of labial movement and facial expression data, was utilized in order to semi-automatically build a new Italian expressive and emotive talking head capable of believable and emotional behavior. The methodology, the procedures and the specific software tools utilized for this scope will be described together with some implementation examples.
Audio-visual speech asynchrony modeling in a talking head
An audio-visual speech synthesis system that models the asynchrony between the auditory and visual speech modalities is proposed in the paper. A corpus-based study of real recordings gave us the data required to understand the problem of modality asynchrony, which is partially caused by coarticulation phenomena. A set of context-dependent timing rules and recommendations was elaborated in or...
Text Driven 3D Photo-Realistic Talking Head
We propose a new 3D photo-realistic talking head with a personalized, photo-realistic appearance. Different head motions and facial expressions can be freely controlled and rendered. It extends our prior, high-quality, 2D photo-realistic talking head to 3D. Around 20 minutes of audio-visual 2D video are first recorded, with prompted sentences read aloud by a speaker. We use a 2D-to-3D reconstru...
The Development of a Brazilian Talking Head
This paper describes partial results of research in progress at the School of Electrical and Computer Engineering of the State University of Campinas, aimed at developing a realistic three-dimensional Brazilian Talking Head. Through an extensive analysis of a video-audio linguistic corpus, a set of 29 phonetic context-dependent visemes (22 consonantal plus 7 vocalic visemes), that accommodat...
Publication date: 1996